ABSTRACT


This paper reviews pixel-based and region-based
(structural) image models. The former include both
one-dimensional  time  series  and  random  field  models,  with 
the  properties  of  the  field  specified  either  locally  or 
globally. 
1.  Introduction

Traditionally, image models have been classified as
statistical or structural [22,48,54]. The statistical models
involve description of image statistics such as autocorrelation,
while the structural approach consists of specification of
structural primitives and placement rules for laying these
primitives out in the plane. It should be noted that if the rules
in the structural approach are not statistical, the resulting
models will be too regular to be interesting. Thus the structural
models too must in part be statistical. A better classification
of image models might be as follows:

a)  Pixel  based  models:  These  models  view  individual 
pixels  as  the  primitives  of  the  texture.  Specification 
of  the  characteristics  of  the  spatial  distribution  of 
pixel  properties  [22,42]  constitutes  the  texture 
description. 

b)  Region  based  models:  These  models  conceive  of  a  texture 
as  an  arrangement  of  a  set  of  spatial  (sub) patterns 
according to certain placement rules [54]. Both the
subpatterns  and  their  placement  may  be  statistically 
characterized.  The  subpatterns  may  further  be  made  up 
of  smaller  patterns. 


In  the  following  sections  we  will  discuss  these  two 
classes  of  models  and  review  many  of  the  studies  of  image 
modeling  conducted  through  1978.  It  should  be  emphasized 
that  image  modeling  is  a  rapidly  evolving  field  and  much 
further  work  is  currently  in  progress. 


2.  Pixel Based Models

Pixel based models can be further divided into two classes:
one-dimensional time series models and random field models.

2.1.  One-Dimensional Time Series Models

Time series analysis [10] has been extensively used [38,60,61]
to study visual textures. The image is TV scanned to provide a
one-dimensional series of gray level fluctuations, which is
treated as a one-dimensional stochastic process evolving in
"time". The future course of the process is presumed to be
predictable by knowing enough about its past.

Before  summarizing  the  models,  we  review  some  of  the 
commonly  used  notation  in  time  series. 

Let

$$\ldots,\; Z_{t-1},\; Z_t,\; Z_{t+1},\; \ldots$$

be a discrete time series where $Z_i$ is the value of the random
variable $Z$ at time $i$. We denote the series by $[Z]$.

Let $\mu$ be the mean of $[Z]$, called the "level" of the process.

Let $[\tilde{Z}]$ denote the series of deviations about $\mu$, i.e.,

$$\tilde{Z}_i = Z_i - \mu$$

Let $[a]$ be a series of outputs of a white noise source,
with mean zero and variance $\sigma_a^2$.

Let $B$ be the "backward" shift operator such that

$$B Z_t = Z_{t-1}; \quad \text{hence} \quad B^m Z_t = Z_{t-m};$$

and let $\nabla$ be the "backward" difference operator such that

$$\nabla Z_t = Z_t - Z_{t-1} = (1-B) Z_t; \quad \text{hence} \quad \nabla^m Z_t = (1-B)^m Z_t.$$

The dependence of the current value $\tilde{Z}_t$ of the random
variable $\tilde{Z}$ on the past values of $\tilde{Z}$ and $a$ is
expressed in different ways, and this gives rise to several
different models [38].


(a)  Autoregressive Model (AR):

In this model the current $\tilde{Z}$-value depends on the
previous $p$ $\tilde{Z}$-values, and on the current noise term:

$$\tilde{Z}_t = \phi_1 \tilde{Z}_{t-1} + \phi_2 \tilde{Z}_{t-2} + \cdots + \phi_p \tilde{Z}_{t-p} + a_t \qquad (1)$$

If we let

$$\phi_p(B) = 1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p$$

then (1) becomes

$$[\phi_p(B)](\tilde{Z}_t) = a_t$$

$[\tilde{Z}]$, as defined above, is known as the autoregressive
process of order p, and $\phi_p(B)$ as the autoregressive operator
of order p. The name "autoregressive" comes from the model's
similarity to regression analysis, and the fact that the variable
$\tilde{Z}$ is being regressed on previous values of itself.

(b)  Moving Average Model (MA):

In (a) above, $\tilde{Z}_{t-1}$ can be eliminated from the
expression for $\tilde{Z}_t$ by substituting

$$\tilde{Z}_{t-1} = \phi_1 \tilde{Z}_{t-2} + \phi_2 \tilde{Z}_{t-3} + \cdots + \phi_p \tilde{Z}_{t-p-1} + a_{t-1}$$

This process can be repeated to yield eventually an expression
for $\tilde{Z}_t$ as an infinite series in the $a$'s.
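As a worked illustration of this repeated substitution, consider the first-order case p = 1, writing $\tilde{Z}_t$ for the deviation of $Z_t$ from the level. The restriction to p = 1 and the condition $|\phi_1| < 1$ (needed for the remainder term to die out) are assumptions made here for brevity; they are not stated in the text above.

```latex
\begin{align*}
\tilde{Z}_t &= \phi_1 \tilde{Z}_{t-1} + a_t \\
            &= \phi_1 \left( \phi_1 \tilde{Z}_{t-2} + a_{t-1} \right) + a_t \\
            &= \phi_1^{\,n} \tilde{Z}_{t-n} + \sum_{k=0}^{n-1} \phi_1^{\,k} a_{t-k} \\
            &\longrightarrow \sum_{k=0}^{\infty} \phi_1^{\,k} a_{t-k}
              \qquad (n \to \infty,\ |\phi_1| < 1).
\end{align*}
```

The weights $\phi_1^{\,k}$ decay geometrically, which suggests why a finite number of noise terms can often serve as a reasonable approximation.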

The moving average model allows a finite number q of
previous a-values in the expression for $\tilde{Z}_t$. This explicitly
treats the series as being observations on linearly filtered
Gaussian noise.

Letting

$$\theta_q(B) = 1 - \theta_1 B - \theta_2 B^2 - \cdots - \theta_q B^q,$$

we have

$$\tilde{Z}_t = [\theta_q(B)](a_t)$$

as the moving average process of order q.

(c)  Mixed Model (ARMA):

To achieve greater flexibility in fitting of actual
time series, this model includes both the autoregressive and
the moving average terms. Thus

$$\tilde{Z}_t = \phi_1 \tilde{Z}_{t-1} + \phi_2 \tilde{Z}_{t-2} + \cdots + \phi_p \tilde{Z}_{t-p} + a_t - \theta_1 a_{t-1} - \theta_2 a_{t-2} - \cdots - \theta_q a_{t-q}$$

i.e.,

$$[\phi_p(B)](\tilde{Z}_t) = [\theta_q(B)](a_t) \qquad (2)$$
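The three stationary models can be sketched in a few lines of Python. This is a minimal illustration, not a fitting procedure: the function name `simulate_arma`, the coefficient values, and the convention that values before the start of the series are zero are all assumptions made for the example. Passing an empty `theta` gives an autoregressive series, and an empty `phi` gives a moving average series.

```python
import random


def simulate_arma(phi, theta, noise):
    """Generate deviations z_t from a noise series a_t.

    z_t = sum_i phi[i] * z_{t-1-i} + a_t - sum_j theta[j] * a_{t-1-j}
    Terms that would reach before the start of the series are skipped
    (i.e. treated as zero), which is an assumption of this sketch.
    """
    z = []
    for t, a_t in enumerate(noise):
        ar = sum(p * z[t - 1 - i] for i, p in enumerate(phi) if t - 1 - i >= 0)
        ma = sum(q * noise[t - 1 - j] for j, q in enumerate(theta) if t - 1 - j >= 0)
        z.append(ar + a_t - ma)
    return z


rng = random.Random(0)
white = [rng.gauss(0.0, 1.0) for _ in range(200)]   # white noise source

ar_series = simulate_arma([0.6], [], white)         # AR(1)
ma_series = simulate_arma([], [0.4], white)         # MA(1)
arma_series = simulate_arma([0.6], [0.4], white)    # ARMA(1,1)
```

Adding the level back to each deviation would give a scan line of gray levels with the corresponding correlation structure.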


In  all  the  three  models  just  mentioned,  the  process 
generating  the  series  is  assumed  to  be  in  equilibrium  about  a 
constant  mean  level.  Such  models  are  called  stationary  models. 

There is another class of models called nonstationary
models, in which the level $\mu$ does not remain constant. The
series involved may, nevertheless, exhibit homogeneous behavior
when the differences due to level-drift are accounted for. It can
be shown [10] that such behavior may be represented by a
generalized autoregressive operator.
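One standard way to account for level drift is to apply the backward difference operator defined earlier before fitting a stationary model. The sketch below assumes this differencing approach; the function name `difference` and the toy series are illustrative only.

```python
def difference(series, order=1):
    """Apply the backward difference operator (1 - B) `order` times."""
    out = list(series)
    for _ in range(order):
        # Each pass replaces the series by z_t - z_{t-1}, losing one sample.
        out = [out[t] - out[t - 1] for t in range(1, len(out))]
    return out


drifting = [2.0 * t + 5.0 for t in range(6)]   # level drifts linearly
quadratic = [t * t for t in range(6)]          # level drifts quadratically

first = difference(drifting)        # the linear drift is removed
second = difference(quadratic, 2)   # two differences remove the quadratic drift
```

After differencing, both toy series are constant, so any remaining fluctuation in real data could then be modelled by the stationary operators above.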

A  time  series  may  show  a  repetitive  pattern  of  periods 
of  similar  characteristics.  For  example,  in  the  TV  scan  of 
an  image  the  intervals  corresponding  to  rows  will  have  similar 
characteristics.  A  generalized  model  that  incorporates  the 
presence  of  such  "seasonal  effects"  in  the  time  series  can 
also  be  obtained  [38]. 

All of the time series models discussed above are
unilateral, i.e., a pixel depends only upon the pixels that
precede it in a TV scan. Any introduction of bilateral dependence
gives rise to more complex parameter estimation problems, even
though both conditional representations are known to be
essentially identical [9,12]. It may be of interest to note that
a frequency domain treatment makes parameter estimation
in the bilateral representation much easier [13].


2.2.  Random Field Models

These models treat the image as a two-dimensional random
field [53,64]. The models make use of the properties of the
grid that defines the pixel locations. We will consider two
subclasses of these models.

2.2.1.  Global  Models 

Global models attempt a description of the field by
specifying a process that can be used to obtain a realization of
the set of gray level values at various pixels, or by
specifying particular properties of the field.

An  important  model  has  been  used  by  oceanographers  [31-33, 
49]  interested  in  the  patterns  formed  by  waves  on  the  ocean 
surface.  Longuet-Higgins  [31-33]  treats  the  ocean  surface  as 
a  random  field  satisfying  the  following  assumptions: 

(a)  the  wave  spectrum  contains  a  single  narrow  band  of 
frequencies,  and 

(b)  the  wave  energy  is  being  received  from  a  large  number 
of  different  sources  whose  phases  are  random. 

Considering such a random field, he obtains [32] the
statistical distribution of wave heights, and derives relations
between the root mean square wave height, the mean height
of the highest p% of the waves, and the most likely height of
the largest wave in a given interval of time.

In subsequent papers [31,32], Longuet-Higgins obtains an
additional set of statistical relations among the parameters
describing (a) a random moving Gaussian surface [31], and (b)
a Gaussian isotropic surface [32].

Some  of  the  results  that  he  derives  are: 

(1)  the  probability  distribution  of  the  surface  elevation, 
and  that  of  the  magnitude  and  orientation  of  the 
gradient, 

(2)  the  average  number  of  zero  crossings  per  unit  distance 
along  a  line  in  an  arbitrary  direction, 

(3)  the  average  length  of  contour  per  unit  area, 

(4)  the  average  density  of  maxima  and  minima  per  unit 
area,  and 

(5)  for  a  narrow  spectrum,  the  probability  distribution 
of  the  heights  of  maxima  and  minima. 

All  the  results  are  expressed  in  terms  of  the  two- 
dimensional  energy  spectrum  up  to  a  finite  order  only.  The 
converse  of  the  problem  is  also  studied  and  solved,  i.e., 
given  certain  statistical  properties  of  the  surface,  to  find 
a  convergent  sequence  of  approximations  to  the  energy  spectrum. 

The  analogy  between  this  work  and  image  processing,  and 
the  significance  of  the  results  obtained  therein,  is  obvious. 
Fortunately  the  assumptions  made  are  also  acceptable  for  images. 

Schachter [57] suggests a version of the above model for
the case of a narrow band spectrum. Panda [47] uses an
analogous approach to analyze background regions selected from
Forward Looking InfraRed (FLIR) imagery. He derives expressions
for (a) density of border points and (b) average number of
connected components in a row of the thresholded picture.
There is good agreement between the observed and the predicted
values in most cases, for most of the pictures considered.
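The observed side of such a comparison can be measured directly from a thresholded row. The sketch below is only an illustration: the function name `row_statistics` is hypothetical, and counting a border point wherever two horizontally adjacent thresholded pixels differ is an assumed definition, not necessarily the one used in [47].

```python
def row_statistics(row, threshold):
    """Return (border_points, components) for one thresholded image row."""
    binary = [1 if g > threshold else 0 for g in row]
    # Assumed definition: a border point lies between two adjacent
    # pixels whose thresholded values differ.
    borders = sum(1 for a, b in zip(binary, binary[1:]) if a != b)
    # A connected component in a row is a maximal run of 1s; count run starts.
    components = sum(
        1 for i, v in enumerate(binary)
        if v == 1 and (i == 0 or binary[i - 1] == 0)
    )
    return borders, components


borders, components = row_statistics([1, 5, 6, 2, 7, 1, 1, 8], threshold=4)
```

Averaging these counts over all rows, and dividing the border count by the row length, would give empirical values to set against the model's predictions.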

Panda  [46]  also  uses  the  same  model  to  predict  the  properties 
of  the  pictures  obtained  by  running  several  edge  operators 
(based  on  differences  of  average  gray  levels)  on  some  synthetic 
pictures  with  normally  distributed  gray  levels,  and  having 
different  correlation  coefficients.  The  images  are  assumed 
to  be  continuous-valued  stationary  Gaussian  random  fields 
with  continuous  parameters. 

Nahi and Jahanshahi [43] suggest modelling the
image as a background statistical process combined with a set
of foreground statistical processes, each replacing the
background in the regions occupied by the objects of the category
which it is assumed to characterize. In estimating the
boundaries of horizontally convex objects on a background in
noisy binary pictures, Nahi and Jahanshahi assume that
the two kinds of regions in the picture are formed by two
statistically independent stationary random processes with
known (estimated) first two moments. However, the borders
of the regions covered by the different statistical processes
are modelled locally. Specifically, the end-points of the
intercepts of the given object on successive rows are assumed
to form a first order Markov process. This model thus also
involves local interactions.


Thus, using the notation

$b(m,n)$ = gray level at the nth column of the mth row,
$\gamma(m,n)$ = a binary function carrying the boundary information,
$b_b$ = a sample gray level from the background process,
$b_o$ = a sample gray level from the object process, and
$v$ = a sample gray level from the noise process,

the model allows us to write

$$b(m,n) = \gamma(m,n)\, b_o(m,n) + [1-\gamma(m,n)]\, b_b(m,n) + v(m,n)$$

where $\gamma$ incorporates the Markov constraints on the object
boundaries.
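The composition rule can be made concrete with a small sketch. Here the binary boundary function, the object and background samples, and the noise are supplied as nested lists; the function name `compose_image` and the constant sample values are assumptions chosen so that the result is easy to read.

```python
def compose_image(gamma, obj, background, noise):
    """Elementwise b = gamma * b_o + (1 - gamma) * b_b + v on nested lists."""
    return [
        [g * o + (1 - g) * b + v
         for g, o, b, v in zip(g_row, o_row, b_row, v_row)]
        for g_row, o_row, b_row, v_row in zip(gamma, obj, background, noise)
    ]


# gamma is 1 inside the object and 0 on the background.
gamma = [[0, 1, 1, 0],
         [0, 0, 1, 1]]
obj = [[9] * 4 for _ in range(2)]          # object process samples
background = [[2] * 4 for _ in range(2)]   # background process samples
noise = [[0] * 4 for _ in range(2)]        # noise-free for readability

image = compose_image(gamma, obj, background, noise)
```

With nonzero noise the object outline is no longer visible directly in the gray levels, which is where the Markov constraint on the row intercepts becomes useful for estimation.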

In a subsequent paper Nahi and Lopez-Mora [44] use a more
complex $\gamma$ function. For each row, $\gamma$ either indicates the
absence of the object or provides a vector estimate of the object
width and its geometric center in that row. The two-dimensional
vector possesses information about the object size and skewness,
and is assumed to be a first-order Markov process.

Pratt  and  Faugeras  [50]  and  Gagalowicz  [17]  view  texture 
as  the  output  of  a  homogeneous  spatial  filter  excited  by  white 
noise,  not  necessarily  Gaussian.  The  image  is  then  characterized 
by  its  mean,  the  histogram  of  the  input  white  noise,  and  the 
transfer  function  of  the  filter.  For  a  given  texture,  the  model 
parameters  are  obtained  as  follows: 

-  The  mean  is  readily  estimated  from  the  image. 

-  Computing  the  autocorrelation  function  (second-order 
moments)  determines  the  magnitude  of  the  transfer  function. 


-  Computing  higher-order  moments  determines  the  phase  of 
the  transfer  function. 

Inverse filtering gives the white noise image and hence its
histogram and probability density. For example, for a Markov
field of order 1 it may be sufficient to replace the
decorrelation operator by a Laplacian, or by gradient operators [50].
However, the whitened field estimate of the independent
identically distributed noise process obtained above will identify
only the spatial operator in terms of the autocorrelation
function, which is not unique. Thus the white noise probability
density and the spatial filter do not, in general, make up a
complete set of descriptors [51]. But it may be possible that
they are sufficient descriptors from the standpoint of visual
texture.
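The forward direction of this model, texture as filtered white noise plus a mean, can be sketched as follows. The uniform noise source, the 2x2 averaging kernel, the wrap-around borders, and the function names are all assumptions of the example; the model itself leaves the noise histogram and the transfer function to be estimated from a given texture.

```python
import random


def filter_noise(noise, kernel):
    """Circularly convolve a 2-D noise field with a small kernel."""
    rows, cols = len(noise), len(noise[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            out[i][j] = sum(
                w * noise[(i - di) % rows][(j - dj) % cols]
                for di, k_row in enumerate(kernel)
                for dj, w in enumerate(k_row)
            )
    return out


def synthesize(rows, cols, kernel, mean, seed=0):
    """Filter white (here uniform, non-Gaussian) noise and add the mean."""
    rng = random.Random(seed)
    noise = [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]
    return [[mean + v for v in row] for row in filter_noise(noise, kernel)]


texture = synthesize(16, 16, [[0.25, 0.25], [0.25, 0.25]], mean=128.0)
```

Changing only the noise histogram while keeping the kernel fixed leaves the autocorrelation shape unchanged, which is one way to see why second-order statistics alone cannot pin down the full description.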

Several authors have proposed models for random surfaces
or random height fields [2,16,35]. In a discussion on surface
patterns in geography Freiberger and Grenander [16] argue that
the earth height field is usually too irregular to be described
by an analytic function of the coordinates with a small number
of free parameters. However, the irregularity cannot be expressed
by pure randomness either, since it is characterized by
strong continuity properties. They therefore suggest the use
of stochastic processes derived from physical principles.
Mandelbrot [35] and Adler [2] discuss a Brownian surface model.


The  representations  of  signals  in  one-dimensional  signal 
processing  that  yield  recursive  solutions  motivate  the  use  of 
differential  (difference)  equations  in  two  dimensions  [29]. 

Jain [29] represents images by random fields of one of three
different kinds, characterized by the three different classes
of partial differential equations, describing a digital shape
by an appropriate finite difference approximation of a partial
differential equation (PDE). The class of hyperbolic PDE's
is shown to provide more general causal models than
autoregressive moving average models. For a given spectral density
function (or covariance function), parabolic PDE's can provide
causal, semicausal, and even noncausal representations. Finally,
elliptic PDE's provide noncausal models that represent
two-dimensional discrete Markov fields. They can be used to
represent both isotropic and nonisotropic images.

Jain [29] argues that the well established theory of PDE's
and their numerical solutions and the availability of many
computer algorithms make PDE representation useful. This
representation also obviates the need for spectral
factorization, which removes the restriction to separable covariance
functions. System identification techniques may be considered for
choosing a PDE model for a given class of images.

Angel  and  Jain  [8]  use  the  diffusion  equation  to  model 
the  spread  of  values  around  any  given  point.  Thus  a  given 
image  is  viewed  as  a  blurred  version  of  some  original  image. 
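A discrete version of this idea is easy to sketch: repeated explicit finite-difference steps of the diffusion equation spread each gray level into its four neighbors. The step size `rate`, the wrap-around borders, and the function name `diffuse` are assumptions of this illustration, which is not the estimation scheme of [8]; for this explicit scheme `rate` should not exceed 0.25.

```python
def diffuse(image, rate=0.2, steps=1):
    """Apply explicit 4-neighbor diffusion steps with wrap-around borders."""
    rows, cols = len(image), len(image[0])
    cur = [[float(v) for v in row] for row in image]
    for _ in range(steps):
        nxt = [[0.0] * cols for _ in range(rows)]
        for i in range(rows):
            for j in range(cols):
                # Discrete Laplacian: sum of the 4 neighbors minus 4x center.
                lap = (cur[(i - 1) % rows][j] + cur[(i + 1) % rows][j]
                       + cur[i][(j - 1) % cols] + cur[i][(j + 1) % cols]
                       - 4.0 * cur[i][j])
                nxt[i][j] = cur[i][j] + rate * lap
        cur = nxt
    return cur


original = [[0, 0, 0],
            [0, 1, 0],
            [0, 0, 0]]
blurred = diffuse(original, rate=0.2, steps=1)
```

A single bright point spreads to its neighbors while the total gray level is preserved, which is the sense in which the observed image is a blurred version of an original.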


In the absence of any knowledge or assumption about the
global process underlying a given image, one may attempt to
describe the joint probability density of the properties (say,
gray level) of the pixels, although this may be an
overspecification, i.e., the modeling may not represent enough
abstraction. It also implies estimation of the spatial probability
density functions of gray levels, which means inference on
the joint probability density of a large number of random
variables corresponding to the pixels in the entire image.
To make the problem a little simpler, attempts have been made
to use parametric models where the form of the probability
density is assumed, or to model the field density by
specifying some "important" properties of the field that may
correspond to more than one probability density function.

Among parametric models of the joint density of pixels in
a window, the multivariate normal has been the one most
commonly used because of its tractability. However, it has been
found to have limited applicability. For binary patterns,
Abend et al. [1] discuss an iterative procedure to obtain an
approximate estimate of the joint probability density function
of the properties of pixels having a multivariate normal
distribution, in terms of lower order marginals of this
distribution. They argue that the multivariate normal approach
is very limiting and that it requires special development when
the sample covariance matrices are singular. Furthermore,
the lower order marginals themselves have to be estimated
based on samples which, in practice, are usually not numerous.

Hunt [25,26] also points out that stationary, Gaussian
modeling of images is an oversimplification. Consider the
vector $F$ of the picture points obtained by concatenating them
as in a TV scan. Let $R_F$ be the covariance matrix of the gray
levels in $F$. Then according to the Gaussian assumption, the
probability density function of $F$ is

$$P(F) = K \exp\left[-\tfrac{1}{2}\,(F-\bar{F})^T R_F^{-1} (F-\bar{F})\right]$$

where $\bar{F}$ = constant mean vector,
$R_F$ = covariance matrix, and
$K$ = normalizing constant.

The stationarity assumption makes $\bar{F}$ a vector of identical
components. This means that each point in the image has the
same ensemble statistics. Images, however, seldom have a
bell-shaped histogram.

A Gaussian model for any set of multivariate data,
however, is the only model that is mathematically tractable to
any reasonable extent. Hunt [25] proposes a nonstationary
Gaussian model which differs from the stationary model only
in that the mean vector $\bar{F}$ has unequal components. He shows
the appropriateness of this model by subtracting, from each
point on the image, its local ensemble average, and showing
that the resulting picture fits a stationary Gaussian model.

Trussel and Kruger [62] show that the Laplacian density
function constitutes a more valid model for high-pass filtered
imagery than the Gaussian model. They show that this discrepancy
neither seriously weakens the applicability of this class of
models to a major restoration method, nor challenges any other
conclusions of the work based on the Gaussian model.

Matheron [37] uses the change in pixel properties as a
function of distance to model a random field. He uses the
term "regionalized variables" to emphasize the particular features
of the pixels whose complex mutual correlation reflects the
structure of the underlying phenomenon. He assumes weak
stationarity of the increments in the gray levels between pixels.
The second moment of the increments for pixels at an arbitrary
distance, called the variogram, is used to reflect the structure
of the field. Knowledge of the variogram is useful for the
estimates of many global and local properties of the field.
Huijbregts [24] discusses several properties of the variogram
and relates them to the structural features of the regionalized
variables. For nonhomogeneous fields having spatially varying
mean, the variogram of the residuals with respect to the local
means is used.
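An empirical variogram follows directly from this definition as the mean squared gray-level increment at each lag. The sketch below restricts the lags to the horizontal direction for brevity, which is an assumption of the example, as is the function name `variogram`; no halving factor is applied, following the definition given in the text.

```python
def variogram(image, max_lag):
    """Mean squared gray-level increment at horizontal lags 1..max_lag."""
    values = []
    for h in range(1, max_lag + 1):
        squared = [
            (row[j + h] - row[j]) ** 2
            for row in image
            for j in range(len(row) - h)
        ]
        values.append(sum(squared) / len(squared))
    return values


ramp = [[0, 1, 2, 3],
        [1, 2, 3, 4]]
gamma_h = variogram(ramp, max_lag=3)
```

For this steadily increasing ramp the variogram grows as the square of the lag, whereas a field whose increments are only weakly related at long distances would level off.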

A characterization similar to the variogram is given by the
autocorrelation function. In work on image restoration, images
have often been modelled by a two-dimensional random field with
a given mean and autocorrelation. The following general
expression has been suggested for the autocorrelation function:

$$R(\tau_1,\tau_2) = \sigma^2 \rho^{\,-\alpha_1|\tau_1| - \alpha_2|\tau_2|}$$

which is stationary and separable. Specifically, the
exponential autocorrelation function ($\rho = e$) has been found to be
reasonably good for a variety of pictorial data [15,18,23,27,30].

Another autocorrelation model often cited as being more
realistic is

$$R(\tau_1,\tau_2) = \rho^{\,-\sqrt{\tau_1^2+\tau_2^2}}$$

which is isotropic, rotation invariant and not separable.
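The difference between the two autocorrelation models can be checked numerically. The sketch takes the exponential case (base e) for both; the decay constants `a1`, `a2`, and `a`, and the function names, are illustrative assumptions rather than values from the cited work.

```python
import math


def separable_acf(t1, t2, a1=0.1, a2=0.2, var=1.0):
    """Exponential model that factors into a row term and a column term."""
    return var * math.exp(-a1 * abs(t1) - a2 * abs(t2))


def isotropic_acf(t1, t2, a=0.1, var=1.0):
    """Exponential model depending only on the Euclidean lag distance."""
    return var * math.exp(-a * math.hypot(t1, t2))


# Separable: R(3,4) * R(0,0) equals R(3,0) * R(0,4).
sep_gap = separable_acf(3, 4) * separable_acf(0, 0) - separable_acf(3, 0) * separable_acf(0, 4)

# Isotropic: lags (3,4) and (5,0) have the same length, hence the same value,
# but the function does not factor into row and column terms.
iso_gap = isotropic_acf(3, 4) - isotropic_acf(5, 0)
factor_gap = isotropic_acf(3, 4) - isotropic_acf(3, 0) * isotropic_acf(0, 4)
```

Here `sep_gap` and `iso_gap` vanish up to rounding while `factor_gap` does not, which mirrors the separable versus rotation-invariant distinction drawn above.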


2.2.2.  Local  Models 


A simplification that could be introduced to reduce the
problems involved in the joint probability specification for
the entire image, as is necessary for the global models, is
to assume that not all points in an image are simultaneously
constrained by a high-dimensional probability density
function, but that this is only true of small neighborhoods of
pixels. However, even for a neighborhood of size 3x3 (or 5x5)
and nonparametric representation one has to deal with
densities in a 9 (or 25) dimensional space, along with the
associated sample size and storage problems. This makes the
approach unwieldy.

Read and Jayaramamurthy [52] and McCormick and Jayaramamurthy
[39] make use of switching theory techniques to identify textures
by describing their local gray level patterns using minimal
functions. If each pixel can take one out of $N_g$ gray levels
then a given neighborhood of n pixels from an image can be
represented by a point in an $n \times N_g$ dimensional space. If many
such neighborhoods from a given texture are considered then
they are likely to provide a cluster of points in the above
space. The differences in the local characteristics of different
textures are expected to result in different clusters. The set
covering theory of Michalski and McCormick [40], which is a
generalization of the minimization machinery of switching
theory already available, is used [39,52] to describe the sets
of points in each cluster. These maximal descriptions also
allow coverage of empty spaces within and around clusters, and
thus the samples do not have to be exhaustive but only have to
be large enough to provide a good representation of the
underlying texture.

Haralick et al. [20] confine the local descriptions to 2x1
neighborhoods. They identify a texture by the gray-level
co-occurrence frequencies at neighboring pixels, which are the first
estimates of the corresponding probabilities. They use several
different features, all derived from the co-occurrence matrix,
for texture classification.
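Co-occurrence frequencies of this kind can be tabulated with a short routine. The sketch counts ordered gray-level pairs at a single displacement (one pixel to the right by default) and does not symmetrize the counts; those choices, the function name `cooccurrence`, and the use of contrast as the example feature are assumptions of the illustration.

```python
from collections import Counter


def cooccurrence(image, dr=0, dc=1):
    """Relative frequency of gray-level pairs (g1, g2) at displacement (dr, dc)."""
    counts = Counter()
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[(image[r][c], image[r2][c2])] += 1
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


image = [[0, 0, 1],
         [0, 1, 1],
         [2, 2, 1]]
freqs = cooccurrence(image)

# One feature that can be derived from the matrix: contrast, the
# frequency-weighted squared gray-level difference of the pair.
contrast = sum(p * (g1 - g2) ** 2 for (g1, g2), p in freqs.items())
```

Repeating the tabulation for other displacements gives one matrix per direction and distance, from each of which further features can be computed in the same way.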

Most of the local models, however, use conditional
properties of pixels within a window, instead of their joint
probability distributions as in the local models discussed
above. We will now discuss these Markov models that make
a pixel depend upon its neighbors.

Time series analysis for the one-dimensional models
discussed earlier can also be used to capture part of the
two-dimensional dependence, without getting into the analytical
problems arising from a bilateral representation. Tou et al.
[60] have done this by making a point depend on the points in
the quadrant above it and to its left. For such a case, the
autoregressive process of order (q,p) is

$$Z_{ij} = \phi_{01} Z_{i,j-1} + \phi_{10} Z_{i-1,j} + \phi_{11} Z_{i-1,j-1} + \cdots + \phi_{qp} Z_{i-q,j-p} + a_{ij};$$

the moving average process of order (q,p) is

$$Z_{ij} = a_{ij} - \theta_{01} a_{i,j-1} - \theta_{10} a_{i-1,j} - \theta_{11} a_{i-1,j-1} - \cdots - \theta_{qp} a_{i-q,j-p};$$


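and the two-dimensional mixed autoregressive/moving average
process is

$$Z_{ij} = \phi_{01} Z_{i,j-1} + \phi_{10} Z_{i-1,j} + \phi_{11} Z_{i-1,j-1} + \cdots + \phi_{qp} Z_{i-q,j-p} + a_{ij} - \theta_{01} a_{i,j-1} - \theta_{10} a_{i-1,j} - \theta_{11} a_{i-1,j-1} - \cdots - \theta_{qp} a_{i-q,j-p}$$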

The model, in general, gives a nonseparable autocorrelation
function. If the coefficients of the process satisfy the
condition

$$\rho_{mn} = \rho_{m0}\, \rho_{0n}$$

then the process becomes a multiplicative process in which the
influence of rows and columns on the autocorrelation is
separable. Thus

$$\rho_{ij} = \rho_{i0}\, \rho_{0j}$$
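A first-order quadrant autoregression can be sketched as follows, with each pixel depending on its left, upper, and upper-left neighbors plus a noise term. The function name `quadrant_ar`, the zero values assumed outside the image, and the coefficient values are illustrative. The multiplicative example uses the product of a row operator and a column operator, which gives a diagonal coefficient of minus the product of the other two; that construction is an assumption of this sketch rather than a formula taken from the text.

```python
import random


def quadrant_ar(noise, phi01, phi10, phi11):
    """z[i][j] = phi01*z[i][j-1] + phi10*z[i-1][j] + phi11*z[i-1][j-1] + a[i][j]."""
    rows, cols = len(noise), len(noise[0])
    z = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            # Neighbors outside the image are treated as zero.
            left = z[i][j - 1] if j > 0 else 0.0
            up = z[i - 1][j] if i > 0 else 0.0
            diag = z[i - 1][j - 1] if i > 0 and j > 0 else 0.0
            z[i][j] = phi01 * left + phi10 * up + phi11 * diag + noise[i][j]
    return z


rng = random.Random(0)
white = [[rng.gauss(0.0, 1.0) for _ in range(32)] for _ in range(32)]

col_coef, row_coef = 0.8, 0.5
texture = quadrant_ar(white, phi01=col_coef, phi10=row_coef,
                      phi11=-col_coef * row_coef)
```

With this choice the response to a single noise impulse factors into a row term times a column term, which is the sense in which the influence of rows and columns separates.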

Tou et al. consider fitting a model to a given texture.
The choice among the autoregressive, moving average and mixed
models, as well as the choice of the order of the process, is
made by comparing the behavior of some observed statistical
properties, e.g., the autocorrelation function, with that
predicted by each of the different models. For each of the possibly
many choices of models, the values of the parameters are
determined so as to minimize, say, the least square error in fit.

A comparison of the predictions of autocorrelation
function, results of transformations of the series, etc., based
upon the model obtained above, with similar properties of the
available data can be used to establish its appropriateness,
or to suggest desirable modifications in the model, e.g.,
changing the order, etc.

In a subsequent paper, Tou and Chang [61] use the
maximum likelihood principle to optimize the values of the
parameters, in order to obtain a refinement of the preliminary
model as suggested by the autocorrelation function.

A bilateral dependence in two dimensions is more complex
than in the one-dimensional case discussed earlier.
Once again, this general formulation has a unilateral
counterpart; for example, making a point depend on the points in the
rows above it, as well as the points to its left on its own
row. However, Whittle [63] gives the following reasons in
recommending working with the original two-dimensional model:

1)  The  dependence  on  a  finite  number  of  lattice  neighbors, 
for  example  a  finite  autoregression  in  two  dimensions, 
may  not  always  have  a  unilateral  representation  that 
is  also  a  finite  autoregression. 

2)  The real usefulness of the unilateral representation
is that it suggests a simplifying change of parameters.
For most two-dimensional models, however, the
appropriate transformation, even if evident, is so
complicated that nothing is gained by performing it. It
may be pointed out that frequency domain analysis for
parameter estimation [13] may prove useful here too.

Two-dimensional Markov random fields have been investigated
for representing textures. A wide sense Markov field
representation aims at obtaining linear dependence of a pixel
property, say its gray level, on the gray levels of certain other
pixels so as to minimize, say, the mean square error between
the actual and the estimated values such that the error terms
of various pixels are uncorrelated random variables. A
strict sense Markov field representation involves specification
of the probability distribution of the gray level given the
gray levels of certain other pixels. Although processes of
both these types have been investigated, more experimental
work has been done on the former.

Woods [65] shows that the strict sense Markov field differs
from a wide sense field only in that the error variables in the
former have a specific correlation structure, whereas the
errors in the latter are uncorrelated. He points out the
restriction on the nonwhite noise (error) process driving
the strict sense model that yields a recognizable field. The
condition under which a general noncausal Markov dependence
reduces to a causal one is also specified.

Abend et al. [1] introduce Markov meshes to model
dependence of a pixel on a certain immediate neighborhood. The joint
probability density for the entire image, then, is the product
of local conditional probability densities at each pixel.
Using Markov chain methods on the sequences of pixels from
various causal dependency neighborhoods of a pixel they show
that in many cases such a causal dependence translates into a
noncausal dependence. For example, the dependence of a pixel
on its west, northwest and north neighbors translates into its
dependence upon all its eight neighbors. Interestingly, the
causal neighborhood that results in a 4-neighbor noncausal
dependence is not known in the formulation, although in the
Gauss Markov formulation of Woods [65] such an explicit
dependence is allowed. In this sense Woods' definition of a
Markov field is more general than the Markov meshes of Abend
et al. [1].

Hassner  and  Sklansky  [21]  also  discuss  a  Markov  random 
field  model  for  images.  They  present  an  algorithm  that  generates 
a  texture  from  an  initial  random  configuration  and  a  set  of 
independent  parameters  that  specify  a  consistent  collection  of 
nearest  neighbor  conditional  probabilities  which  characterize 
the  Markov  random  field. 
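The general idea of growing a texture from a random start under nearest-neighbor conditional probabilities can be sketched with a binary field. This is not the algorithm of [21]: the logistic form of the conditional probability, the parameters `alpha` and `beta`, the raster-order updating, and the wrap-around borders are all assumptions chosen to give a simple, consistent example.

```python
import math
import random


def sample_binary_mrf(rows, cols, alpha, beta, sweeps, seed=0):
    """Resample each pixel from P(x=1 | 4 neighbors) for a number of sweeps."""
    rng = random.Random(seed)
    # Initial random configuration.
    x = [[rng.randint(0, 1) for _ in range(cols)] for _ in range(rows)]
    for _ in range(sweeps):
        for i in range(rows):
            for j in range(cols):
                # Number of the four nearest neighbors that are 1.
                s = (x[(i - 1) % rows][j] + x[(i + 1) % rows][j]
                     + x[i][(j - 1) % cols] + x[i][(j + 1) % cols])
                p_one = 1.0 / (1.0 + math.exp(-(alpha + beta * s)))
                x[i][j] = 1 if rng.random() < p_one else 0
    return x


# beta > 0 favors agreement with neighbors; alpha = -2 * beta keeps 0 and 1 balanced.
field = sample_binary_mrf(16, 16, alpha=-4.0, beta=2.0, sweeps=10, seed=1)
```

Larger values of `beta` tend to produce larger uniform patches, while `beta = 0` makes the pixels independent.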


Deguchi and Morishita [14] use a noncausal model for the
dependence of a pixel on its neighborhood centered at the pixel.
The weights are determined by minimizing the mean square
estimation error. The optimal two-dimensional estimator
characterizes the texture. They use such a characterization for
classification and for segmentation of images consisting of more
than one textural region.

Jain and Angel [27] use 4-neighbor autoregression to model
a given autocorrelation function. They obtain values of the
autoregression coefficients in terms of the desired
autocorrelation function, which does not have to be separable.
However, their representation involves error terms that are
uncorrelated with each other and with the non-noisy pixel gray
level values. As pointed out by Panda and Kak [45], these two
assumptions about the error terms are incompatible for Markov
random fields [65].

Jain  and  Angel  [27]  point  out  that  a  4-neighbor  Markov
dependence  can  represent  a  large  number  of  physical  processes
such  as  steady  state  diffusion,  random  walks,  birth  and  death
processes,  etc.  They  also  propose  8-neighbor  [27]  and
5-neighbor  (the  8  neighbors  excluding  the  northeast,  east,
and  southeast  neighbors)  [27,28]  models.

Wong  [64]  discusses  the  characterization  of  second  order
random  fields  (having  finite  first  and  second  moments)  from  the
point  of  view  of  their  possible  use  in  representing  images.
He  considers  various  properties  of  a  two-dimensional  random
field,  and  their  implications  in  terms  of  its  second-order
properties.  Some  of  the  results  he  obtains  are  as  follows:

(1)  There  is  no  continuous  Gaussian  random  field  of  two 
dimensions  (or  higher  dimensions)  which  is  both 
homogeneous  and  Markov  (degree  1).

(2)  If  the  covariance  function  is  invariant  under
translation  as  well  as  rotation,  then  it  can  only
depend  upon  the  Euclidean  distance.  The  second-order
properties  of  such  fields  (Wong  calls  them
homogeneous)  are  characterizable  in  terms  of  a
single  one-dimensional  spectral  distribution.

Wong  generalizes  his  notion  of  homogeneity  to  include 
random  fields  that  are  not  homogeneous,  but  can  be  easily 
transformed  into  homogeneous  fields.  Even  this  generalized 
class  of  fields  is  no  more  complicated  than  a  one-dimensional 
stationary  process. 

Lu  and  Fu  [34]  identify  the  repetitive  subpatterns  in
some  highly  regular  textures  from  Brodatz  [11]  and  design  a
local  descriptor  of  the  subpattern  in  an  enumerative  way  by
generating  each  of  the  pixels  in  the  window  individually.
The  subpattern  description  is  done  by  specifying  a  grammar
whose  productions  generate  a  window  in  several  steps.  For
example,  starting  from  the  top  left  corner,  rows  may  be  generated
by  a  series  of  productions,  while  other  productions  will
generate  individual  pixels  within  the  rows.  The  grammar  used 


Region  Based  Models 


The  next  few  models  use  the  notion  of  a  structural 
primitive,  although  both  the  shapes  of  the  primitives  and 
the  rules  to  generate  the  textures  from  the  primitives  may 
be  specified  statistically. 

Matheron  [36]  and  Serra  [58]  propose  a  model  that  views 
a  binary  texture  as  produced  by  a  set  of  translations  of  a 
structural  element.  All  locations  of  the  structural  elements 
such  that  the  entire  element  lies  within  the  foreground  of  the 
texture  are  identified.  Note  that  there  may  be  (narrow)  regions 
which  cannot  be  covered  by  any  placement  of  the  structural 
element,  as  all  possible  arrangements  of  the  element  that  cover 
a  given  region  may  not  lie  completely  within  the  foreground.
Thus  only  an  "eroded"  version  of  the  image  can  be  spanned  by
the  structural  element;  this  eroded  version  is  used  as  the
representation  of  the  original  image.  Textural  properties  can  be
obtained  by  appropriately  parameterizing  the  structural  element.
It  is  interesting  to  note  that  for  a  structural  element  consisting
of  two  pixels  at  distance  d,  the  area  of  the  eroded  image  is  the
value  of  the  autocovariance,  at  distance  d,  of  the  original
image.  More  complicated  structural  elements  would  provide  a
generalized  autocovariance  function  which  has  more  structural
information.  Matheron  and  Serra  show  how  the  generalized
covariance  function  can  be  used  to  obtain  various  texture  features.
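
The two-pixel case can be checked numerically with a short sketch. It assumes a horizontal displacement, a binary image stored as an array, and an uncentered autocovariance (the sum of products of pixel pairs at distance d, with no mean removed); the function names are illustrative.

```python
import numpy as np

def erode_two_point(img, d):
    """Erode a binary image by a structural element of two pixels d columns apart.

    A position survives only if both the pixel and the pixel d columns to
    its right lie in the foreground.
    """
    img = np.asarray(img, dtype=bool)
    width = img.shape[1]
    return img[:, :width - d] & img[:, d:]

def raw_autocovariance(img, d):
    """Uncentered autocovariance at horizontal lag d: sum of pixel-pair products."""
    x = np.asarray(img, dtype=float)
    width = x.shape[1]
    return float(np.sum(x[:, :width - d] * x[:, d:]))

rng = np.random.default_rng(0)
texture = rng.random((64, 64)) < 0.4   # a random binary test image
areas = [int(erode_two_point(texture, d).sum()) for d in range(6)]
covariances = [raw_autocovariance(texture, d) for d in range(6)]
```

For every lag in this sketch the eroded area and the uncentered autocovariance coincide, since a product of two binary pixels is 1 exactly when both lie in the foreground.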


Zucker  [67]  conceives  of  a  real  texture  as  being  a  distortion
of  an  ideal  texture  which  is  a  spatial  layout  of  primitives  as
cells  in  a  regular  or  semiregular  tessellation.  Certain
transformations  are  applied  to  the  primitives  to  distort  them  to
provide  a  realistic  texture.  The  statistical  nature  of  the  texture
can  be  provided  through  these  transformation  rules.

Yokoyama  and  Haralick  [66]  describe  a  growth  process  to 
synthesize  textures.  Their  method  consists  of  the  following 
steps:

a)  Mark  some  of  the  pixels  in  a  clean  image  as  seeds. 

b)  The  seeds  grow  into  curves  called  skeletons. 

c)  The  skeletons  thicken  to  become  regions. 

d)  The  pixels  in  the  regions  thus  obtained  are  transformed 
into  gray  levels  in  the  desired  range. 

e)  A  probabilistic  transformation  is  applied,  if  desired,
to  modify  the  gray  level  cooccurrence  probability  in 
the  final  image. 
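
The first four steps can be sketched as follows. The specific choices here (uniformly random seeds, random-walk skeletons, thickening by repeated 4-neighbor dilation on a toroidal grid, and a two-level gray scale) are assumptions made for illustration and are not the operations of [66]; the optional final step is omitted.

```python
import numpy as np

def grow_texture(shape, n_seeds, walk_len, thickness, gray_range, rng):
    """Seeds -> skeletons -> regions -> gray levels, under the stated assumptions."""
    rows, cols = shape
    # (a) Mark some pixels of a clean image as seeds (uniformly at random here).
    seeds = np.column_stack([rng.integers(0, rows, n_seeds),
                             rng.integers(0, cols, n_seeds)])
    # (b) Each seed grows into a skeleton curve via a random walk.
    steps = np.array([(0, 1), (0, -1), (1, 0), (-1, 0)])
    skeleton = np.zeros(shape, dtype=bool)
    for i, j in seeds:
        for _ in range(walk_len):
            skeleton[i, j] = True
            di, dj = steps[rng.integers(0, 4)]
            i, j = (i + di) % rows, (j + dj) % cols
    # (c) Skeletons thicken into regions by repeated 4-neighbor dilation.
    region = skeleton.copy()
    for _ in range(thickness):
        region = (region
                  | np.roll(region, 1, axis=0) | np.roll(region, -1, axis=0)
                  | np.roll(region, 1, axis=1) | np.roll(region, -1, axis=1))
    # (d) Region and background pixels are mapped to gray levels.
    low, high = gray_range
    image = np.where(region, high, low)
    return seeds, skeleton, region, image
```

A deterministic variant would replace the random seed placement and random walk with fixed rules.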

The  distribution  processes  in  (a)  and  the  growth  processes
in  (b)  and  (c)  can  be  deterministic  or  random.  Yokoyama  and
Haralick's  method,  however,  amounts  to  an  ad  hoc  sequence  of
growth  operations  to  generate  a  random  pattern,  since  the
dependence  of  the  properties  of  the  generated  images  on  the
nature  of  the  underlying  operations  is  not  obtained.  This  makes
the  approach  unsuitable  for  texture  description  or  classification.

A  class  of  models  called  mosaic  models,  based  upon
random,  planar  pattern  generation  processes,  has  been
considered  by  Ahuja  [2,3,4,5],  Ahuja  and  Rosenfeld  [6]  and
Schachter,  Davis,  and  Rosenfeld  [56].  Schachter  and  Ahuja
[55]  describe  a  set  of  random  processes  that  produce  a  variety
of  interesting  piecewise  uniform  random  planar  patterns  having
regions  of  different  shapes  and  with  different  relative
placement.  These  patterns  are  analyzed  for  various  geometrical  and
topological  properties  of  the  components,  and  for  the  pixel
correlation  properties  in  terms  of  the  model  parameters
[3,4,5,6].  Given  an  image  and  various  feature  values  measured  on
it,  the  relations  obtained  above  are  used  to  select  the
appropriate  model.
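
One way to produce a piecewise uniform random planar pattern is sketched below. It assumes a nearest-nucleus (Voronoi-type) tessellation with uniformly scattered nuclei and independently chosen cell colors; this is offered as a single illustrative instance, not as the specific processes described in [55].

```python
import numpy as np

def occupancy_mosaic(shape, n_nuclei, colors, rng):
    """Tessellate the image into nearest-nucleus cells and color each cell uniformly.

    Returns the cell label of every pixel and the resulting mosaic image.
    """
    rows, cols = shape
    # Scatter nuclei uniformly over the image plane.
    nuclei = rng.uniform([0.0, 0.0], [rows, cols], size=(n_nuclei, 2))
    ii, jj = np.mgrid[0:rows, 0:cols]
    pixels = np.stack([ii.ravel(), jj.ravel()], axis=1).astype(float)
    # Squared distance from every pixel to every nucleus.
    d2 = ((pixels[:, None, :] - nuclei[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1).reshape(shape)
    # Each cell receives one color, drawn independently from the given set.
    cell_colors = rng.choice(colors, size=n_nuclei)
    return labels, cell_colors[labels]
```

Properties such as mean cell area or pixel correlation at a given distance could then be measured on the generated pattern and compared with values measured on a given image.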

The  syntactic  model  of  Lu  and  Fu  [34]  discussed  earlier 
can  also  be  interpreted  as  a  region  based  model,  if  the 
subpattern  windows  are  viewed  as  the  primitive  regions. 

We  may  point  out  that  although  the  model  used  by  Nahi  and 
Jahanshahi  [43]  and  Nahi  and  Lopez-Mora  [44]  discussed  earlier 
is  pixel  based,  the  function  y  carries  information  about  the 
borders  of  various  regions.  Thus,  under  the  constraint  that 
all  regions  except  the  background  are  convex,  the  model  can 
also  be  interpreted  as  a  region  based  model. 


4.  Discussion


Region  based  models  are  inherently  more  powerful  than 
pixel  based  models.  For  the  case  of  images  on  grids  this  is 
easy  to  see.  Consider  a  subpattern  that  consists  of  a  single 
pixel.  The  region  shapes  are  thus  trivially  specified.  It 
is  obvious  that  the  region  characteristics  and  their  relative 
placement  rules  can  be  designed  so  as  to  mimic  the  pixel  and 
joint  pixel  properties  of  a  pixel  based  model,  since  both  have 
control  over  the  same  set  of  primitives  and  can  incorporate 
the  same  types  of  interactions.  This  shows  that  region  based 
models  are  at  least  as  powerful  as  pixel  based  models.  On 
the  other  hand  if  we  are  dealing  with  images  that  are
structured,  i.e.,  that  have  planar  clusters  of  pixels  such  that
pixels  within  a  cluster  are  related  in  a  different  way  than 
pixels  across  clusters,  then  we  must  make  such  a  provision  in 
the  model  definition.  Such  a  facility  is  unavailable  in  pixel 
based  models,  whereas  the  use  of  regions  as  primitives  serves 
exactly  this  purpose.  It  should  also  be  pointed  out  that 
region  based  models  appear  to  be  more  appropriate  for  the 
representation  of  natural  textures,  which  do  usually  consist 
of  regions. 

Many  texture  studies  are  basically  technique  oriented  and 
describe  ad  hoc  texture  feature  detection  and  classification 
schemes  which  are  not  based  upon  any  underlying  model  of  the 
texture.  We  do  not  discuss  these  here;  see  [19,41,59]  and 
the  references  therein.